1. INTRODUCTION

This document provides a functional description of the GR1 graphics board.

The reader will get to peruse the following exciting tales: 

• overview 

• block diagram 

• interfaces - what comes on and off the board 

• Geometry Engine (GE5) 

• GRF1 fixes

• RE1 and frame buffer memory organization

• XMAPs and color lookup tables

• hardware cursor & RAMDACs

• display state machine

2. OVERVIEW 

Refer to Figure 1, the block diagram. 

The GR1 performs the graphics functions for the Eclipse system. The host provides the GR1 with
descriptions of 2D and 3D objects, and the GR1 takes care of drawing these objects and displaying them on
the screen. The descriptions take the form of Graphics Library commands and world coordinate vertex
data, and they describe the object's geometric position, color, and surface normal vectors used for lighting
calculations. The GR1 board performs transformations and other graphics operations to calculate specific
pixel values for each of the 1.3 million pixels on the 1280 by 1024 high resolution monitor. These pixel
values are stored in the video RAM frame buffer and displayed on the monitor at a refresh rate of 60 Hz.

The GR1 consists of three sections: the Geometry Subsystem, the Raster Subsystem, and the Display
Subsystem. The Geometry Subsystem (called the GE5 module for Geometry Engine 5) includes the
interface to the host as well as the floating point graphics computation engine. Commands and data from
the host are examined by the GE5 and routed to one of two places. Display commands such as color map
or cursor data are sent out over the display bus to update the appropriate location. Drawing commands for
the geometry engine are stuffed in a fifo to even the flow between the host and the GE5. While commands
remain in the fifo, the GE5 will read those commands and their associated data out of the fifo and perform
the necessary calculations. The Raster Subsystem knows how to draw points, lines and spans (horizontal
lines). Therefore, the GE5 breaks down world coordinate graphics primitives received from the host to the
level of points, lines or spans described in screen coordinates. (Primitives from the host include points,
lines, convex and concave polygons, mesh triangles, characters, splines, NURBS.) These simple point/line
primitives are then passed to the Raster Subsystem for scan conversion.

The GE5 also maintains the status of the current graphics context. This status includes a 4 by 4 matrix
stack for coordinate data transformations, a 3 by 3 matrix stack for surface normal transformations, and
light source information such as position, direction and intensity. Whenever the current graphics context is
replaced with a new context, the GE5 passes the current status information to the host, and replaces it with
new information from the host.

The Raster Subsystem scan converts lines (random or horizontal spans) into pixel values at each point on
the screen, and controls all memory timing for writing these values into the frame buffer. The GR1 board
includes 12 bitplanes of frame buffer on board. However, the Raster Subsystem includes all the necessary
signals for expansion to 32 total bitplanes of frame buffer, and 24 planes of Z buffer. One optional
daughter board (the BP4 board) may be plugged into the top of the GR1 to add 20 bitplanes of frame
buffer, while another optional daughter board (the ZB3) adds the 24 planes of Z buffer. The Raster
Subsystem includes hardware support for stippling lines, patterning horizontal spans, anti-aliasing color
index lines, and dithering 12 bit RGB pictures. The frame buffer is broken up into image bitplanes,
overlay/underlay bitplanes, and window ID bitplanes. Image bitplanes hold the color image to be
displayed on the screen. In the 12 bitplane system, 8 bitplanes are provided for an 8 bit single buffered
color index image, or a 4 bit double buffered color index image. Two bitplanes are dedicated for overlay
or underlay applications such as pop-up menus and background colors. The last two bitplanes are for
window ID, providing hardware support for a windowed environment. 24 bitplanes of image, 4 bitplanes of
overlay/underlay, and 4 bitplanes of window ID are supported in the fully loaded system.
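
The bitplane counts quoted above can be cross-checked with a few lines of arithmetic. The following is only a sketch: the dictionary keys and function names are illustrative and do not come from the GR1 documentation.

```python
# Bitplane budgets quoted in the overview; the dictionary keys are illustrative names.
BASE_PLANES = {"image": 8, "overlay_underlay": 2, "window_id": 2}
FULL_PLANES = {"image": 24, "overlay_underlay": 4, "window_id": 4}
BP4_ADDED_PLANES = 20  # bitplanes contributed by the optional BP4 daughter board


def total_planes(config):
    """Sum the bitplanes of every class in a configuration."""
    return sum(config.values())


def color_index_depths(image_planes):
    """Return (single buffered bits, double buffered bits) for a color index image."""
    return image_planes, image_planes // 2
```

With these definitions the base board totals 12 planes, the fully loaded system totals 32, and the 20 planes of the BP4 account for the difference between the two.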

The Display Subsystem accepts data from the serial output ports of the frame buffer and appropriately
maps these outputs depending upon display mode before actually sending the pixels to be displayed on the
monitor. The window IDs found in the frame buffer WID bitplanes tell the XMAP devices whether to
interpret the data in the image bitplanes as single buffered or double buffered, color index or RGB images.
The XMAP chips multiplex the image data appropriately and conditionally route it through the color maps
such that the correct 24 bit values are fed to the RAMDAC inputs. The overlay/underlay bitplanes override
the standard 24 bit values inside the XMAPs whenever an overlay or underlay is required. A hardware
cursor is provided which overrides the standard 24 bit values inside the RAMDACs to assure the cursor
gets highest priority.

The Display State Machine provides timing for the five pixel pipes (video RAM shift registers, XMAP,
hardware cursor), and the monitor sync and blank. The timing changes with different kinds of monitors
used. Four monitor timings come standard on this board. The four are: 1) 1280 by 1024 pixel
non-interlaced 60Hz; 2) 1280 by 1024 pixel interlaced 30Hz; 3) 645 by 485 pixel RS170 30Hz; 4) 780 by 575
pixel EURO 30Hz. To support a timing not listed above, one of the oscillators must be swapped out and
the Display State Machine PROM must be reprogrammed. The Display State Machine generates all sync,
blank, load, and clock signals for the various standards, as well as generating data transfer requests to the
Raster Subsystem, and a vertical interrupt to the host.

3. BLOCK DIAGRAM 

Refer to Figure 1. The following breaks down each subsystem of the block diagram. 

The Geometry Subsystem consists of: 

• HQ1 Gate Array: This chip provides control for the interface to the host, control for three burst DMA
channels, and control for the geometry engine (GE5) floating point compute module. On the host
interface side, the GR1 acts as a slave device on the interface bus. The whole Geometry Subsystem
runs off the 10 MHz I/O clock provided by the host, such that the host interface may run completely
synchronously. The HQ1 monitors the bus for host accesses, and decodes the addresses when an
access arrives, sending strobes to the destination devices. The HQ1 handshakes with the host to extend
cycles for slower devices or speed cycles for faster devices.

Bidirectional burst DMA transfers may occur between three separate pairs of endpoints. The host may
perform read or write burst DMA transfers between its own main memory and the GE5 data RAM, or
between its own main memory and the frame (or Z) buffer. The third DMA channel is between the
GE5 data RAM and the frame (or Z) buffer. The HQ1 provides handshake control and a special GE5
microinstruction in order to operate the burst DMA channels. (All burst DMAs are performed under
GE5 microcode supervision.)

The HQ1 acts as the microcontroller for the GE5. It controls all program flow and all data movement
across the GE5 data bus. Inside the HQ1 is a program control unit which allows branching and
subroutining, pointer control units for addressing the GE5 data ram and the register bank inside the RE1
gate array, and a data bus management unit which controls output enables on the GE5 data bus, on the
host interface bus, and on the display bus. Stall control is incorporated on the HQ1 which stalls GE5
microcode, allowing real-time host accesses to the GE5 data ram at any time during GE5 microcode
execution. The microcode may pass flags and other information to the host through the data ram.
(Note, with the current RE1, a host read during a burst DMA from frame buffer to GE5 data ram will
cause microcode to hang. The RE2 will fix this problem. See GRF1 sections below for current fixes.)

• Display Bus Buffer: This is an eight bit transceiver which allows data to pass between the host
interface and the display bus. Only the low eight bits of the host interface bus are passed on to the
Display Bus.

• GE5 Data Bus Buffer: This is four eight bit transceivers allowing data to be passed between the GE5
bus and the host interface. Host accesses to entities on the GE5 data bus (e.g. data ram, microcode
ram) move the data through these transceivers. Also, burst transfers that involve the host pass the data
through this route.

• FIFO: This is a 512 by 40 fifo. Thirty-two bits of data and 8 bits of address are shoved into this FIFO.
The 8 address bits are the 8 LSBs used on the host interface bus and are used as a tag for the HQ1.
Sixteen of the 256 possible tags will force the HQ1 to switch to a new graphics context on the next read
from fifo. The other 240 possible tags provide microcode jump addresses for the graphics commands
passed down from the host. These tags are used in conjunction with the FETCH instruction of the
microcode. When simply data is being read from the fifo, the thirty-two bits of data are dumped onto
the GE5 data bus. The fifo is used to even the flow of commands from the host, and execution by the
GE5. Some commands take longer than others to execute on the GE5; the inclusion of a fifo allows the
host to continue to write down graphics commands independent of what the GE5 is currently up to.

• Microcode RAM: There are 16K 40-bit words of microcode RAM. Address into the microcode RAM
is generated by the HQ1. Some microcode instructions are vertical in that they occupy two locations in
this microstore. Most highly utilized microinstructions fit into a single 40 bit micro-word. Output from
the microstore controls the operation of the Weitek 3132, and feeds back to the HQ1 to help decide
what is the next microinstruction and data bus transaction. During reset, the host loads all the
microcode before unstalling the HQ1.

• Microcode RAM buffer: This is the data path for the host to load the microcode before operation 
begins. 

• Data RAM: 8K 32-bit words of data RAM are provided. Data RAM is used to store constants and
variables needed in the geometry calculations.

• Weitek 3132: This is a floating point data path chip. It provides pipelined floating point multiply and
adds and 32 working registers within which to do the computations. All geometry and lighting
transformations are performed in this chip. It is the core of the GE5. Refer to the 3132 data sheet.

• GRF1 Gate Array: The sole purpose of this gate array is to fix other problems in the system. It solves
four problems. The first is within the HQ1. When a host access comes in while microcode is executing
a read from fifo instruction, incorrect handshakes are passed back to the host. The GRF1 monitors this
condition and corrects the handshake. The next three problems are all related to misinterpretation of
the burst DMA handshakes between the HQ1 and the RE1. The fixes will be described in more detail
below, but essentially the GRF1 corrects the handshakes such that all burst DMA transfers operate
correctly, and provides two flag bits which the host may read off the host interface bus. The up and
coming RE2 will solve the last three problems, while the first problem requires one of the following:
1) continuing to use the GRF1; 2) re-spinning the HQ1; or 3) adding a couple of 14-pin DIPs to the
graphics board.
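
The 40-bit FIFO word described in the FIFO entry above (32 bits of data plus an 8-bit tag) can be modelled as follows. This is a minimal sketch: the text does not say which end of the word holds the tag, so placing it in the upper eight bits is an assumption, and the function names are hypothetical.

```python
FIFO_DEPTH = 512   # entries, from the "512 by 40" description
DATA_BITS = 32     # data written by the host
TAG_BITS = 8       # the 8 address LSBs, used as a tag by the HQ1


def pack_fifo_entry(tag, data):
    """Combine a tag and a data word into one 40-bit entry (tag in the top bits, by assumption)."""
    if not 0 <= tag < (1 << TAG_BITS):
        raise ValueError("tag must fit in 8 bits")
    if not 0 <= data < (1 << DATA_BITS):
        raise ValueError("data must fit in 32 bits")
    return (tag << DATA_BITS) | data


def unpack_fifo_entry(entry):
    """Split a 40-bit entry back into (tag, data)."""
    return entry >> DATA_BITS, entry & ((1 << DATA_BITS) - 1)
```

Every packed entry is below 2**40, which matches the 40-bit width of the FIFO.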

The Raster Subsystem consists of: 

• RE1 Gate Array: This is a large gate array which performs scan conversion of endpoints of lines into
pixels on the screen, and controls all memory timings into 32 bits/pixel of frame buffer and 24
bits/pixel of Z buffer. The GE5 loads a bank of registers on the RE1 chip to indicate what kind of
drawing operation is desired. The RE1 then iterates lines (random or horizontal spans) into individual
pixels and performs the appropriate write cycles into memory. The chip can flat shade or Gouraud
shade RGB or color index values while at the same time testing for WID and Z buffer consistencies.
The chip uses a unique interleaving scheme into the frame buffer and Z buffer memories to allow page
mode access speeds. The RE1 can stipple random lines, pattern horizontal spans, anti-alias color index
lines, and dither 12 bit RGB images.

• Twelve Frame Buffer Bitplanes: Twelve bitplanes are provided with the base system. Twenty more
bitplanes may be had with the addition of the BP4 daughter card. An optional ZB3 Z buffer card is also
available as a plug in.

The Display Subsystem consists of: 

• XMAP2 Gate Arrays: Five XMAP2 gate arrays are used, one for each pixel pipe out of the frame
buffer. Each accepts up to 32 bits of data from the frame buffer, examines the window ID data and
overlay data, and outputs the appropriate data to the RAMDACs (through the color maps if necessary).
The XMAPs are by now rather standard SGI functionality. Refer to the XMAP2 specification for more
information. Suffice it to say that they allow simultaneous single and double buffering in various
modes in multiple windows, overlays, underlays, and all options chosen on the basis of the contents of
the window ID bitplanes.

• Color Maps: Each of the five pixel pipes has a 4K 24-bit word color lookup table. These support up to 
twelve bits of color index that are used as the pointer into the color map to determine the 24 bit value 
that will be passed to the RAMDACs. When only eight image bitplanes are available in the system, the 
upper four bits may be wired on a window by window basis inside the XMAP2. 

• Hardware Cursor: Two Brooktree Bt431 cursor chips may be inserted into the GR1. Each chip
contains a 64 by 64 cursor glyph and some X,Y counters which keep track of the current position of the
monitor beam on the screen. When the monitor beam is at a location where a bit in the cursor glyph is
active, the Bt431 outputs an active signal to the overlay inputs on the RAMDAC. The RAMDAC will
turn that pixel on as a cursor bit. Two bits of cursor color may be had by installing both cursor chips.
Refer to the chip spec.

• RAMDACs: Three RAMDACs are used, the Brooktree Bt457, one for each of red, green and blue.
The RAMDACs provide multiplexing of the five pixel pipes down to one data stream, another color
lookup table for gamma correction of values in this data stream, two overlay inputs to allow the cursor
to be superimposed on the data stream, and D/A conversion of the data stream into RS343 level signals
appropriate for the monitor. Refer to the chip spec.

• The Display State Machine (DSM): The DSM controls the timing of pixel data as it flows through the 5
pixel pipes, as well as timing for the SYNC and BLANK signals which control the monitors. Which
timing to use is indicated by two bits that the host may modify in an XMAP display register. The
timings supported on the standard product are: 1) 1280 by 1024 pixel non-interlaced 60Hz; 2) 1280 by
1024 pixel interlaced 30Hz; 3) 645 by 485 pixel RS170 30Hz; 4) 780 by 575 pixel EURO 30Hz. There
are three oscillators that reside on the GR1 board that are used to provide the timing. The first provides
a 107.352 MHz clock to support both 1280 by 1024 modes. The second provides a 12.27 MHz clock
for RS170, and the third runs at 15.00 MHz to support PAL and SECAM. There is a fourth choice of
clock which is from the J4 connector. This is typically sourced off the genlock board (described
below), although it could be used as the clock source for some unusual timing mode.

For a given timing standard, the DSM allows varying the widths of the SYNC and BLANK signals,
varying the length of a frame buffer line to be displayed (all images less than 1280 pixels wide must be
left adjusted), varying the number of lines displayed, and choosing between interlaced and
non-interlaced. SYNC and BLANK pulses are adjustable within a 5 pixel resolution.

The DSM interacts with the Raster Subsystem by requesting a data transfer cycle of the RE1. Every new
display line, new data must be loaded into the serial shift register which is part of every video RAM
chip (from the chip's dynamic RAM array). This loading is done with a data transfer cycle. The DSM
understands when horizontal blanking is going on and asks the RE1 to perform the transfer from RAM
to shift register inside the video RAMs during the horizontal blanking period. The RE1 provides the
proper page address and shift start address for the data row to be displayed next (see video RAM data
sheets).

The DSM is built out of a Xilinx 2018 programmable logic array, an 8K by 8-bit PROM, a 2K by 8-bit 
RAM, a Bt438 clock generator chip, and some other miscellaneous clock control circuitry. 
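
The clock figures given for the DSM lend themselves to a quick arithmetic check. The sketch below assumes that each of the five pixel pipes runs at one fifth of the pixel clock (a reading of the statement that the RAMDACs multiplex five pipes into one stream) and that the 5 pixel adjustment step corresponds to five pixel clock periods. Neither assumption is stated in the text, and the function names are hypothetical.

```python
PIXEL_CLOCK_HZ = 107.352e6   # oscillator used for both 1280 by 1024 modes
PIXEL_PIPES = 5              # pixel pipes multiplexed by the RAMDACs
SYNC_STEP_PIXELS = 5         # SYNC/BLANK adjustment resolution


def per_pipe_rate_hz(pixel_clock_hz=PIXEL_CLOCK_HZ):
    """Pixel rate seen by one pipe, assuming an even five-way split."""
    return pixel_clock_hz / PIXEL_PIPES


def sync_step_ns(pixel_clock_hz=PIXEL_CLOCK_HZ):
    """Duration of one SYNC/BLANK adjustment step in nanoseconds."""
    return SYNC_STEP_PIXELS / pixel_clock_hz * 1e9


def active_fraction(width, height, frame_rate_hz, pixel_clock_hz=PIXEL_CLOCK_HZ):
    """Fraction of pixel clock periods that carry a visible pixel."""
    return width * height * frame_rate_hz / pixel_clock_hz
```

Under these assumptions each pipe runs at about 21.47 MHz, one adjustment step lasts roughly 46.6 ns, and about 73 percent of the clock periods in the 60Hz non-interlaced mode carry visible pixels, with the remainder presumably spent in blanking.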

4. INTERFACES 

The GRl incorporates the following interfaces: 

— interface to the host CPU (IP6)

— interface to the bitplane expansion board (BP4)

— interface to the Z buffer option board (ZB3)

— interface to the Genlock Board (CG3) 

— stereoptic bit and other miscellaneous signals 

4.1 Interface to the Host 

The host interface runs across a ribbon cable between the IP6 and the GR1.

4.1.1 Hardware Interface

All control signals (except VERTSTAT) are buffered on the GR1 board before
being used on or sent off the board. The multiplexed address/data bus goes to five destinations on the GR1
without being buffered. Worst case capacitance is low, and reasonable time is provided before the data on
the bus is sampled to allow the data to settle (possible because the bus is synchronous). Four clock signals
(HOST-PRECLK, HOST-10MHZ, HOST-OE-EN\, HOST-CASSTB\) are passed from the host to the GR1
to drive the interface and the GE5. The first three of these are actually used by the GR1. The system is
designed assuming there can be no more than 10ns of skew between the clock signal on the GR1 and its
equivalent clock on the IP6. The amount of skew is determined by the buffer delay on each of the boards
where they receive the clocks off the interface (the IP6 sends its clocks onto the cable then wraps them
back through buffers before using them). Figure 2 shows how the buffered clocks should appear on the
GR1.

The host interface signals are: 

• HOST-10MHZ: The host provides the 10 MHz clock which drives the GE5. The 10 MHz clock is
distributed to the GE5 on two lines; one is free running, and the other can be stopped for stalling the
3132 when the HQ1 determines a stall is necessary.

• HOST-PRECLK: This is a 10 MHz clock which is shifted approx 15ns ahead of the HOST-10MHZ.
This clock is used to compensate for the possible 10ns skew between clocks on the two boards. All
signals passed between the host and the graphics board are latched using this clock.

• HOST-OE-EN\: This is a 10 MHz clock which follows HOST-10MHZ exactly but with a much shorter
high time. The high time of this signal is used to disable all drivers onto the host interface address/data
bus, as well as most drivers on the GE5 data bus. The intent is to eliminate contention among drivers
on a given bus.

• HOST-CASSTB\: This signal is unused on the GR1.

• HOST-RST\: A low level on this signal will reset the following items on the GR1: HQ1, fifos, GRF1,
RE1, Bt431, Bt438s, XMAP2s.

• HOST-AS\: The address strobe goes low for one clock cycle to initiate a transfer and to indicate a
valid address is on the bus.

• HOST-RD\: Indicates whether this is a host read or a host write access.

• HOST-DLY\: The host asserts this signal low after an address strobe to lengthen a transfer. The HQ1
re-clocks HOST-DLY\ before using it. The cycle when both re-clocked HOST-DLY\ and IO-DLY\ are
high after an address strobe is the last cycle of the access.

• IO-DLY\: The HQ1 asserts this signal low after an address strobe to lengthen a transfer. The host does
NOT re-clock this signal before using it. The cycle when both re-clocked HOST-DLY\ and IO-DLY\
are high after an address strobe is the last cycle of the access.

• HOST-BURST\: When the host wishes to perform a burst DMA transfer, and after the GE5 has
indicated via software that it is ready to transfer, the host asserts this signal low to initiate the actual
burst transfer. The burst transfer continues until this signal goes inactive.

• FIFOHALF\: This signal goes low whenever the fifo is half full. It is an indication to host software
that it should hold off sending commands down the graphics pipe.

• IO-INTR\: This interrupt bit may be set by GE5 microcode and cleared by a host access.

• VERTINTR\: This interrupt is asserted once every monitor frame slightly ahead of vertical blanking.
The exact positioning of the interrupt may be controlled by the Display State Machine firmware.

• VERTSTAT: This is a status bit which is asserted during vertical blanking which may be polled by
host software. The exact positioning of this signal within vertical blanking may be controlled by
Display State Machine firmware.

• HOST-AD[0-31]: Address and data are multiplexed on these pins. Twenty bits of address are valid
while the address strobe is active, and 32 bits of data are valid depending on the state of HOST-DLY\
and IO-DLY\. These lines have pullups on them on the GR1.

Timing for the host interface handshake signals is given in Figure 3. 
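
The rule that ends an access (both re-clocked HOST-DLY\ and IO-DLY\ high after the address strobe) can be expressed as a small check over sampled signal levels. This is only a sketch: the per-cycle sample lists and the convention that index 0 is the first cycle after the address strobe are assumptions made for illustration, and Figure 3 remains the authority on exact timing.

```python
def last_cycle_index(host_dly_n, io_dly_n):
    """Return the index of the first cycle where both active-low delay signals are high.

    Each argument is a list of sampled levels (0 = asserted low, 1 = high), one per
    clock cycle following the address strobe. Returns None if the access never ends
    within the samples given.
    """
    for index, (host_level, io_level) in enumerate(zip(host_dly_n, io_dly_n)):
        if host_level == 1 and io_level == 1:
            return index
    return None
```

For example, if the host holds its delay line low for two cycles and the HQ1 holds its own low on cycles 0 and 2, the access ends on cycle 3.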

4.1.2 Software Interface

Ten bits of address (HOST-AD[2-11]) are decoded by the HQ1. Two additional
bits (HOST-AD[12-13]) are decoded by the GRF1 to allow for implementation of fixes involving the host
interface. The twelve address bits are decoded along with the address strobe and the read strobe to
determine who's involved in what kind of cycle. Offsets into the graphics address space are given in the
following table:

ADDRESS        ADDRESSED                                         ACCESS

0 - 1023       ucode RAM, low word (plus MADDRREG[7]=0)          read/write
0 - 1023       ucode RAM, high byte (plus MADDRREG[7]=1)         read/write
1024 - 2047    data RAM w/o GRF1 fifo fix (plus MADDRREG[7]=0)   read/write

MADDRREG[7]=1 FOR ACCESSING THE FOLLOWING LOCATIONS:

1024 - 1055    xmap channel 0                                    read/write
1056 - 1087    xmap channel 1                                    read/write
1088 - 1119    xmap channel 2                                    read/write
1120 - 1151    xmap channel 3                                    read/write
1152 - 1183    xmap channel 4                                    read/write
1184 - 1215    xmap broadcast                                    write only
1216 - 1247    xmap display reg 3                                read/write
1248 - 1279    xmap display reg 4                                read/write
1280 - 1311    red brooktree dac                                 read/write
1312 - 1343    green brooktree dac                               read/write
1344 - 1375    blue brooktree dac                                read/write
1376 - 1407    cursor chip 0                                     read/write
1408 - 1439    cursor chip 1                                     read/write
1440 - 1471    xmap display reg 0                                read/write
1472 - 1503    xmap display reg 1                                read/write
1504 - 1535    xmap display reg 2                                read/write
1600 - 1663    clear stall                                       write only
1664 - 1727    set single step mode                              write only
1728 - 1791    clear single step mode                            write only
1792 - 1855    execute a single step                             write only
1856 - 1919    read current PC                                   read only
1920 - 1983    clear interrupt 0                                 write only
1984 - 2047    clear interrupt 1                                 write only
2048 - 3071    fifo                                              read/write
3072 - 3583    load MADDRREG[6-0]                                write only
3584 - 4095    load MADDRREG[7]                                  write only
5120 - 6143    data RAM with GRF1 fifo fix (plus MADDRREG[7]=0)  read/write
8192           GRF1 flag bit 1                                   read/write
8196           GRF1 flag bit 2                                   read/write

Unused addresses above 4096 should not be accessed. Only addresses 1536 - 1599 are not recognized by
the HQ1.
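
The offset table can be exercised with a small decoder. The sketch below is hedged in two ways. First, it assumes the offsets are byte addresses, which is an inference from the 8192/8196 spacing of the GRF1 flag bits and from the decoded range starting at HOST-AD[2]. Second, it copies only part of the table, and it reads the MADDRREG[7]=1 heading as applying to the display regions listed directly under it. The names `split_offset` and `lookup` are hypothetical.

```python
# Partial copy of the offset table; these regions are assumed to need MADDRREG[7]=1.
DISPLAY_REGIONS = [
    (1024, 1055, "xmap channel 0"),
    (1056, 1087, "xmap channel 1"),
    (1088, 1119, "xmap channel 2"),
    (1120, 1151, "xmap channel 3"),
    (1152, 1183, "xmap channel 4"),
    (1184, 1215, "xmap broadcast"),
    (1280, 1311, "red brooktree dac"),
    (1312, 1343, "green brooktree dac"),
    (1344, 1375, "blue brooktree dac"),
    (1376, 1407, "cursor chip 0"),
    (1408, 1439, "cursor chip 1"),
]


def split_offset(offset):
    """Split an offset into the HQ1 field (HOST-AD[2-11]) and the GRF1 field (HOST-AD[12-13]).

    Assumes byte offsets, so bits 0-1 are not part of either field.
    """
    hq1_field = (offset >> 2) & 0x3FF
    grf1_field = (offset >> 12) & 0x3
    return hq1_field, grf1_field


def lookup(offset, maddrreg7):
    """Name the resource selected by an offset below 4096 and the MADDRREG[7] bit."""
    if 0 <= offset <= 1023:
        return "ucode RAM, high byte" if maddrreg7 else "ucode RAM, low word"
    if 1024 <= offset <= 2047 and not maddrreg7:
        return "data RAM w/o GRF1 fifo fix"
    if maddrreg7:
        for first, last, name in DISPLAY_REGIONS:
            if first <= offset <= last:
                return name
    if 2048 <= offset <= 3071:
        return "fifo"
    return None  # region not covered by this partial table
```

Under the byte-offset assumption, offset 5120 yields the same HQ1 field as offset 1024 with the GRF1 field set to 1, which is consistent with the table listing 5120 - 6143 as the data RAM accessed with the GRF1 fifo fix.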

4.2 Interface to Bitplane Expansion

This interface consists of a bundle of timing control signals sent from the RE1 to the bitplane expansion
board (BP4), and both parallel and serial port data buses.

4.2.1 Hardware Interface

The following signals are passed across the J8 and J9 connectors:

• FADDR[0-7]: The multiplexed row and column address lines.

• LEN[0-3]: These are the serial clock enables which select which of the four rows of video RAMs is
currently being displayed on the screen. Only the one of four rows which is currently selected gets its
shift register clocked.

• VIDLDD[0-2]: These are the serial shift register clocks which clock data out of the video RAM shift
registers. These signals are gated on the BP4 with the LEN signals to produce the actual serial clock to
the VRAMs.

• FDATA[8-27]: The parallel port data bus for the 20 bitplanes on the BP4.

• PIXH[A-E][8-23]: These are the serial data streams providing another sixteen bitplanes of image data
to the XMAP2s.

• AUX[A-E][0-3]: These are the four auxiliary bitplanes that come with the BP4.

• RAS[0-3]\: The row address strobes for the four rows of video RAMs.

• CAS[0-4]\: The column address strobes for the five columns of video RAMs.

• OE[0-4]\: The output enables for the five columns of video RAMs.

• FWE: The VRAM write enable.

• RECLKEP: The same clock as drives the RE1, used to clock certain registered drivers.

• BPIN\: This simply indicates to software via an XMAP2 readable register that the BP4 is actually
installed.

The timing of these signals is identical to those which drive the bitplanes which come standard on the GR1
board. Refer to the RE1 documentation.

4.3 Interface to the Z-Buffer Board

This interface consists of a bundle of timing control signals sent from the RE1 to the Z buffer board (ZB3),
plus data and address buses. The Z buffer board simply holds the memory for the Z data, while the actual Z
compares and decision making goes on inside the RE1 chip.

• ZADDR[0-8]: Nine bits of multiplexed row and column address to accommodate the one megabit
DRAMs which make up the Z buffer.

• ZDATA[0-23]: Twenty-four bits of Z data.

• ZRAS\: A single row address strobe as the one megabit DRAMs are arranged one row by five
columns.

• ZCAS[0-4]\: The column address strobes for the five columns of DRAMs.

• OE[0-4]\: The output enables for the five columns of DRAMs.

• ZWE\: The write enable for the Z buffer DRAMs.

• ZlG[0-2]\: Output enables for the registered transceivers which sit between the RE1 and the actual
DRAM devices.

• ZBIN\: This indicates to software via an XMAP2 readable register that the ZB3 is actually installed.

The timing of these signals is very similar to those which drive the bitplanes which come standard on the
GR1 board. Refer to the RE1 documentation for more information.
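
The nine multiplexed ZADDR lines imply an 18-bit location address inside each DRAM, which can be sketched as below. Sending the upper nine bits as the row half and the lower nine as the column half is an assumption, since the text does not give the ordering, and the function names are hypothetical.

```python
ZADDR_BITS = 9          # multiplexed address lines ZADDR[0-8]
DRAM_BANK_COLUMNS = 5   # the DRAMs are arranged one row by five columns


def split_zaddr(location):
    """Split an 18-bit DRAM location into (row_address, column_address) halves."""
    mask = (1 << ZADDR_BITS) - 1
    return (location >> ZADDR_BITS) & mask, location & mask


def locations_per_dram():
    """Number of locations addressable with a 9-bit row and a 9-bit column address."""
    return 1 << (2 * ZADDR_BITS)
```

Nine row bits and nine column bits address 262,144 locations per device, and five columns of such devices give 1,310,720 locations, which equals 1280 times 1024, one Z value per pixel of the high resolution display.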

4.4 Interface to the Genlock Board (CG3) 

Using the GR1 with the genlock board allows synchronous video display between two machines running
with the same monitor type selected. When in genlock, the source for the pixel clock comes off the CG3
board. This pixel clock is derived from the incoming master signal from the other (external) system. Thus,
the two systems clock at the same rate. To ensure the two systems provide data at the same time, the CG3
sends a synchronization signal, called GENSYNC\, to the GR1. GENSYNC\ forces the GR1 display state
machine to a particular spot in its video timing sequence. The CG3 chooses when it asserts GENSYNC\
such that the video timing sequences match up between the two machines. Genlocking is supported for any
of the four monitor types. GENSYNC\ is a frame rate signal (i.e. NOT field rate for interlaced). Note that
the GR1 horizontal sync should be used by the CG3 board to phase lock the ECL clock that is driving the
GR1 (thus getting the accuracy to within the phase error of the phase comparator, rumored at less than 1/2
pixel). Genlock signals are sent out over the J5 connector. The CG3 uses the signals listed below.

• Composite Vertical and Horizontal Sync 

• Horizontal Drive 

• Composite Vertical and Horizontal Blanking 

• Field Bit indicating whether the current field is the odd or even field. 

• Least significant bit of blue on each of the five pixel pipes. These five bits are used for multiplexing the 
images from the two genlocked machines on a pixel by pixel basis (done on the CG3). 

4.5 Stereoptic Bit and Other Miscellaneous

A bit in an XMAP2 register may be written by software. The bit drives out on the genlock connector and
may be used by application programs to control stereoptic viewers.

Also driven onto the genlock connector are four diagnostic signals: FIFOEMPTY\ - tells if the fifo is
empty; FIFOHALF\ - indicates when the fifo is more than half full; CLKSTALL\ - shows when the GE5 is
in a stalled state; RELOADEN1\ - shows when the RE1 is idle, i.e. not drawing.

5. GE5 
See Figure 4. 

The GE5 is a floating point compute engine controlled by 16K 40-bit words of microstore. The microcode
word is detailed in Figure 5. The microcode word is either 40 bits wide, or two cycles may be taken to
access 80 bits of information for a single microinstruction. The 40 bit microinstruction has all the
information necessary to control the internal functions of the 3132, all PC control, the ability to increment
the REPTR and MEMPTR, and control over the commonly used data bus output and write enables. Most
instructions may simply use this single field, including conditional branches which do not get taken. The
second 40 bit field is mostly used for constants, for instance target addresses for branches and values to
load in the MEMPTR. Also found in the second field are tools for setting the interrupt bit and controls for
the burst DMA channel.

The HQ1 includes all the decode circuitry for the portions of the microcode field which affect the Program
Counter (PC), the Data Memory Pointer (MEMPTR), the RE Pointer (REPTR), and the data bus enables.
The PC, MEMPTR, and REPTR registers reside on the HQ1. Inside the HQ1 is a two stage pipeline. The
first stage decodes the current output from the microcode memory to determine what the next instruction
will do, and how the next instruction will affect the PC, MEMPTR, REPTR and data bus enables. The second
stage of the pipeline controls the current activity on the data bus and in the 3132, and represents the result
of the previous cycle's output from the microcode memory. Reiterating the pipeline concept, it takes one
clock cycle before the activity for the current microinstruction is seen on the GE5 data bus and in the 3132.
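
The one-cycle offset between decode and execution can be illustrated with a toy trace. This is a deliberately simplified sketch that ignores stalls, branches, and 80-bit instructions; the function name and the tuple layout are hypothetical.

```python
def pipeline_trace(microwords):
    """Return one (decoding, executing) pair per clock cycle for a two stage pipeline.

    `decoding` is the microword currently output by the microcode memory (stage one);
    `executing` is the microword whose activity is visible on the data bus and in the
    3132 (stage two), which is the word decoded on the previous cycle.
    """
    trace = []
    executing = None
    for decoding in list(microwords) + [None]:  # one extra cycle drains the pipeline
        trace.append((decoding, executing))
        executing = decoding
    return trace
```

For a two-word program `["A", "B"]` the trace shows A being decoded alone on the first cycle, A executing while B is decoded on the second, and B executing on the third.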

The pipeline may be stalled in its current state by several conditions. First, the HQ1 provides a single step
mode whereby only one instruction gets executed every time the host writes to a particular GR1 address.
Second is the provision of a microinstruction which stalls microcode execution at the command of software
until the host unstalls the GE5. The third condition happens when the microcode is attempting to read a
value from the fifo, but the fifo is empty. The microcode will stall with the PC pointing to the
microinstruction after the fifo read, but no output enable will be issued to the fifo. The GE5 will stay in this
state until the fifo is no longer empty. At this point a fifo output enable will be issued and the GE5 will be
unstalled. The fourth event which causes a stall that maintains the current state of the pipeline is a read
or write to the RE1 register bank when the RE1 is not ready to accept a transfer. The RE1 interface to the
GE5 includes two banks of 32 registers. The HQ1 looks at the upper bit of the REPTR, which determines
in which bank the register to be accessed sits. The HQ1 then conditions the access with the state of either
RE-LOADEN0\ or RE-LOADEN1\ (using the former if REPTR5 is 0 and the latter if REPTR5 is 1). An
access where the appropriate RE-LOADENx\ is not low will stall the GE5 pipeline until such a time as it
does go low.
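
The four stall conditions can be summarized as a small predicate. The sketch below is illustrative only:
the function and parameter names are invented here, and the active-low signals RE-LOADEN0\ and
RE-LOADEN1\ are represented by arguments with an _n suffix (0 means asserted).

```python
def re_bank_ready(reptr, loaden0_n, loaden1_n):
    # REPTR bit 5 selects which bank of 32 registers is addressed.
    bank = (reptr >> 5) & 1
    enable_n = loaden1_n if bank else loaden0_n
    # The enables are active low: 0 means the bank can accept a transfer.
    return enable_n == 0


def pipeline_stalled(single_step_wait, stall_instruction, fifo_read, fifo_empty,
                     re_access, reptr=0, loaden0_n=0, loaden1_n=0):
    if single_step_wait:           # 1: single step mode, waiting for a host write
        return True
    if stall_instruction:          # 2: stall microinstruction, host has not unstalled
        return True
    if fifo_read and fifo_empty:   # 3: fifo read attempted on an empty fifo
        return True
    if re_access and not re_bank_ready(reptr, loaden0_n, loaden1_n):
        return True                # 4: addressed RE1 register bank is not ready
    return False
```

For example, an RE1 access with REPTR = 0x20 consults RE-LOADEN1\ only, so the state of RE-LOADEN0\
does not matter for that access.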

The HQ1 will also stall the GE5 on an access to the GE5 data memory (this is the only part of the address
space which may be accessed while microcode is actually running; all other accesses must ensure that the
GE5 is already stalled via a reset or a stall microinstruction). This differs from the other stalls in that the
second stage of the pipeline is not maintained in its current state. Instead, it is flushed for the period of the
host access (which forces MEMOE\ or MEMWE\ and HOSTOE\ active). The second stage of the pipeline
is then restored from save-away registers before the HQ1 unstalls the GE5.

A stall is evidenced on the GR1 by the CLKSTALL signal, which is high throughout a stall. This signal is
brought out on the third LED for immediate viewing. Note also that the second 40 bit field of an 80 bit
microinstruction is accessed by stalling the 3132 for a single clock. Whenever the second field of a
microinstruction is accessed, CLKSTALL will go high for a single clock cycle. The HQ1 is still messing
with the PC, MEMPTR, REPTR and data bus enables, but the 3132 is kept in its current state for this clock
tick.

Data may be passed between any combination of the fifo, the 3132, the data RAM, and the RE, with the
exception that data may not pass from the fifo directly into the RE (reason: a host access will fail if it
occurs during a transfer like this when the appropriate RE-LOADENx\ is not active). Typical operation
has the GE5 getting its command from the fifo (the PC jumps to the value found on FIFO-TAG shifted left
once), followed by whatever variables the host has stuffed in the fifo associated with that command. Then
computations are done inside the 3132 with constants and variables being passed extensively between the
3132 and the GE5 data ram. Finally, drawing commands are loaded into the RE register banks to tell the
RE to scan convert lines into pixels.
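
The data path rule and the command dispatch can be written down in a few lines. This is a minimal sketch
under the assumptions stated in the text above; the unit names and function names are hypothetical.

```python
UNITS = ("fifo", "3132", "ram", "re")


def transfer_allowed(source, destination):
    if source not in UNITS or destination not in UNITS:
        raise ValueError("unknown unit")
    # The one excluded path: fifo data may not go straight into the RE.
    return not (source == "fifo" and destination == "re")


def dispatch_pc(fifo_tag):
    # The PC jumps to the FIFO-TAG value shifted left once.
    return fifo_tag << 1
```

With this model a tag of 0x15 dispatches to microcode address 0x2A, and fifo data bound for the RE has to
be staged through the 3132 or the data ram first.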

All burst DMA transfers are controlled by a specially provided microcode instruction called REPEATGEZ.
For a burst DMA transfer, the microcode enters the HQ1 into DMAMODE, loads a counter in the HQ1
with the number of transfers about to happen, then executes a REPEATGEZ instruction with the
appropriate source and destination data bus enables set. The same scenario occurs whether the transfer be
host-ram or host-RE or ram-RE. The only microcode difference is which data bus enables are set. The
HQ1 (and GRF1 as explained below) takes over from there. Depending upon which burst transfer is
happening, the HQ1 monitors the HOST-BURST\ signal and the RE-DLY\ signal to determine when a
transfer has actually happened. If the transfer is host-RE, any clock cycle where HOST-BURST\ is low
and RE-DLY\ after being re-clocked is high indicates a transfer has happened and the counter inside the
HQ1 is decremented. The host may stall in the middle of a DMA transfer by asserting HOST-BURST\,
and the RE may stall by asserting RE-DLY\. HOST-BURST\ is passed through the HQ1 to the RE1 on the
GE5-DLY\ while RE-DLY\ is re-clocked and passed through the HQ1 to the host as IO-DLY\. The host
and RE1 both maintain their own separate counts of the number of transfers left in the current burst. The
host-ram and ram-RE handshakes are subsets of the host-RE handshake where the signals do not get passed
through the HQ1.
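
The host-RE counting rule can be sketched as a per-clock loop. This is an illustrative model, not the HQ1
logic: the function name is invented, active-low signals are given as 0/1 levels with an _n suffix, and the
re-clock register is assumed to start low (an assumption, since the reset state is not given here).

```python
def run_burst(count, host_burst_n, re_dly_n):
    """Step a transfer counter over per-clock samples of the two handshake lines.

    host_burst_n and re_dly_n are sequences of 0/1 levels, one entry per clock.
    Returns (remaining_count, clocks_used).
    """
    reclocked_re_dly_n = 0  # assumed initial state of the re-clock register
    clocks = 0
    for host_level, re_level in zip(host_burst_n, re_dly_n):
        clocks += 1
        # A transfer happens when HOST-BURST is low and re-clocked RE-DLY is high.
        if host_level == 0 and reclocked_re_dly_n == 1:
            count -= 1
        # RE-DLY sampled now is what the HQ1 acts on during the next clock.
        reclocked_re_dly_n = re_level
        if count == 0:
            break
    return count, clocks


remaining, clocks = run_burst(3, [0, 0, 1, 0, 0, 0], [1, 1, 1, 0, 1, 1])
print(remaining, clocks)
```

In this run the host holds off one clock and the RE holds off one clock, so three transfers take six clocks.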

More detail on the GE5 and its operation may be found in the HQ1 specification.

6. GRF1 Fixes

The GRF1 gate array was designed to fix four problems on the original GR1 design. The first problem is a
fault inside the HQ1 chip which appears when the HQ1 is output enabling the fifo at the same time that a host
accesses GE5 data ram. The HQ1 notes a request has been received from the host to access the data ram,
and does not service that request until it has finished outputting the fifo. It then performs the requested host
access. What is missing is that the HQ1 should hold off the host using the IO-DLY\ signal until the fifo
transaction is completed. Instead, IO-DLY\ remains unchanged in the high state and the host believes the
transfer to be immediately complete. The fix for this problem requires software to assert HOST-AD12 for
every access to the GE5 data ram (see address map above). Whenever HOST-AD12 is high, the GRF1
chip will automatically assert IO-DLY\ until the IO-DLY\ from the HQ1 goes low, at which point the HQ1
IO-DLY\ gets passed straight through. This succeeds in keeping the host waiting until the HQ1 has
finished servicing any microcode fifo reads.
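
The IO-DLY\ behavior of this fix can be sketched as a two-state machine. The model below is an
illustration only; the function name is hypothetical and the signal levels are given as one 0/1 sample per
clock for a single host access.

```python
def grf1_io_dly_n(host_ad12, hq1_io_dly_n):
    """Return the IO-DLY level seen by the host for each clock of one access.

    host_ad12    -- True if software asserted HOST-AD12 for this access
    hq1_io_dly_n -- per-clock levels of IO-DLY as driven by the HQ1 (0 = delay)
    """
    out = []
    passthrough = not host_ad12
    for level in hq1_io_dly_n:
        # Once the HQ1 itself drives IO-DLY low, hand control back to it.
        if not passthrough and level == 0:
            passthrough = True
        # Until then the GRF1 holds IO-DLY low to keep the host waiting.
        out.append(level if passthrough else 0)
    return out


print(grf1_io_dly_n(True, [1, 1, 0, 0, 1]))
```

Without HOST-AD12 the HQ1 levels pass through unchanged, which is the behavior that let the host
complete too early.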

The second problem involves burst DMA transfers which write data to the RE1. For burst transfers in this
direction, the RE1 does not expect the HQ1 to re-clock the RE-DLY\ signal before examining its state.
Therefore, the handshake is off by one clock cycle and the two endpoints of the burst transfer end up out of
sync with each other. The GRF1 chip looks at several board level signals and manages to predict the
RE-DLY\ signal in such a fashion that it comes out of the GRF1 chip one clock ahead of when it comes out of
the RE1. Essentially, the RE1 burst control state machine is re-created in the GRF1 with fixes included.

The next two problems are both related to burst DMA transfers which read data from the RE1. Burst read
transfers take two 10MHz clock cycles for each datum transferred. The RE1 produces an RE-DLY\ low on
every other 10MHz clock cycle to draw the transfer out to two cycles. Since RE-DLY\ is re-clocked
before being used by either the HQ1 or the host, the data is actually used by the host or the GE5 in the state
after the one in which RE-DLY\ is high. Now, if GE5-DLY\ goes low during the state when the data was
to be used (RE-DLY\ now low), the data should be maintained on the GE5 bus by the RE1 until GE5-DLY\
goes high again. Instead, the RE1 does not recognize GE5-DLY\ at all on this cycle and puts the next
datum out on the bus and decrements its transfer counter. The RE1 only recognizes GE5-DLY\ on the
same cycle as its RE-DLY\ output is high.

For ram-RE transfers, GE5-DLY\ never goes low until the transfer is finished, so no problem ever occurs.
For host-RE transfers, the GRF1 chip predicts the rising edge of HOST-BURST\ one clock ahead of when
it actually is received from the host, curing the handshake mismatch.

The fourth problem, also a consequence of this read DMA handshake mismatch, occurs when the host does
a regular access to the GE5 data ram while a ram-RE burst transfer is in progress. The host access will
cause GE5-DLY\ to go low randomly. The initial assertion of GE5-DLY\ may be lost by the RE1 if it
occurs on the wrong half of a read DMA transfer, and the RE1 will incorrectly decrement its transfer
counter. The only time the host could access the GE5 data ram while a ram-RE read DMA was in process
is when the host is testing some flag locations. The GRF1 solves the problem by providing two flag bits to
the host which may be set by microcode using one of the unused MEMPTR address bits, and read by the
host by asserting HOST-AD13.

7. RE1 and Frame Buffer Organization

The RE1 is a dense gate array which controls the parallel port timing for the frame buffer video RAM
chips, timing for the Z buffer DRAM chips, and interpolates pixels along random-angled or horizontal scan
lines. Detailed timing diagrams and explanations for the RAM chip control signals are given in the RE1
chip specification.

The RE1 has three ports. The first port is the interface to the GE5 which holds 64 registers that may be
accessed by the GE5. The second port is to the frame buffer which holds up to 24 bitplanes of image
memory, and four bitplanes of overlay/underlay memory. The third port is to the Z buffer which holds 24
planes of Z buffer memory, and four bitplanes of window ID (WID) memory. The two possible bitplane
configurations for the GR1 are shown in Figure 6. Frame buffer and Z buffer organization are shown in
Figure 7.

The RE1 performs rectangular clears by clearing the 20 pixel word all in one page mode cycle. The Z
buffer is cleared by actually writing the WID planes 20 pixels at a time, using one of the bits as a
dirty (i.e. cleared) bit for the Z buffer. Flat fills are performed on the 5 pixel frame buffer and Z buffer
word. Shaded fills are done using an interleaving scheme across the 5 pixel frame and Z buffer words. All
five chips are put into page mode, with the CAS\ and OE\ of only one of them being asserted at a time in
synchrony with the data for that chip. This allows shaded pixels to be filled one pixel every RE1 clock
tick. Z buffered shaded pixels take one pixel every clock tick for pixels that are not written, and approximately
one pixel every two clock ticks for pixels that are written.

The RE1 takes care of both refreshing the memory, and ensuring that the RAS\ low time specification is not
exceeded for long shaded span fills. The Display State Machine sends TRRQ\ to the RE1 to indicate it
should perform a video RAM data transfer cycle. The RE1 keeps track of the row address to use during
this data transfer cycle.

The RE1 includes DDA iterators for X, Y, Z, R, G, B. The red iterator may also be interpreted as a 12-bit
color index. The GE5 loads the RE1 registers with initial values and delta values for X, Y, Z, R, G, B, and
a count register with the number of pixels to fill. The RE1 is then issued a command to draw the line (or
flat shaded screen clear). The RE1 will add the delta to each of the initial values to determine pixel values
along the line. Iteration will continue until the count register reaches zero. Z buffer and WID compare
circuitry is also part of the RE1. Writes can be made conditionally on a pixel by pixel basis dependent on
the values in the Z buffer and WID planes.
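
The iteration scheme amounts to repeated addition of per-pixel deltas. The sketch below is illustrative
only: the function name is hypothetical, plain Python numbers stand in for whatever fixed-point format
the RE1 registers use, and the first pixel is assumed to take the initial values unchanged.

```python
def dda_span(initial, delta, count):
    """Generate per-pixel values by adding a delta to each iterator.

    initial -- dict of starting values, e.g. keys "x", "y", "z", "r", "g", "b"
    delta   -- dict with the same keys giving the per-pixel increment
    count   -- number of pixels to fill
    """
    values = dict(initial)
    pixels = []
    while count > 0:
        pixels.append(dict(values))      # emit the current pixel
        for name in values:
            values[name] += delta[name]  # step every iterator by its delta
        count -= 1                       # stop when the count reaches zero
    return pixels


span = dda_span({"x": 10, "y": 5, "r": 0}, {"x": 1, "y": 0, "r": 16}, 4)
print([(p["x"], p["r"]) for p in span])
```

A horizontal span is the special case where the Y delta is zero and the X delta is one, as in the call above.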

Pattern masking and line stippling may be enabled or disabled within the RE1. Uncorrected anti-aliased
color index lines are supported in RE1 hardware. Dithering is accomplished for 12 bit RGB images by
semi-randomly incrementing the RGB nibbles, the randomness being chosen by a hardwired dithering
table.
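
The contents of the hardwired dithering table are not given here. As a rough illustration of the idea only,
the sketch below substitutes a standard 4 by 4 Bayer ordered-dither matrix for the real table and assumes
each color component starts as an 8 bit value that is reduced to a 4 bit nibble. Both are assumptions, and
the function name is hypothetical.

```python
# Stand-in threshold table (standard 4x4 Bayer matrix), NOT the RE1's table.
BAYER_4X4 = [
    [0, 8, 2, 10],
    [12, 4, 14, 6],
    [3, 11, 1, 9],
    [15, 7, 13, 5],
]


def dither_nibble(component8, x, y):
    nibble = component8 >> 4      # the 4 bits that will be stored
    fraction = component8 & 0xF   # the 4 bits that would otherwise be lost
    # Increment the nibble when the lost fraction exceeds the table entry
    # for this screen position; never overflow past 15.
    if fraction > BAYER_4X4[y & 3][x & 3] and nibble < 15:
        nibble += 1
    return nibble
```

Neighboring pixels with the same input value then land on different nibbles, which averages out to the
finer shade over a small area.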

The RE1 supports burst DMA reads and writes to the frame buffer or the Z buffer. The GE5 loads the
transfer count and then issues a read DMA or write DMA command. The RE1 readies the memory chips
in page mode, then performs rapid reads or writes while decrementing its transfer count and monitoring the
GE5-DLY\ line to ensure a transfer should occur. Upon receiving a burst DMA command, the RE1
switches its own clock (via external gates) so that it is running off the same 10MHz clock that drives the
host and GE5. Typically the RE1 is driven off the PAL 15MHz oscillator.

8. XMAP2s and Color Look-Up Tables

The XMAP2s provide multiplexing for the image bitplane data on a window by window basis, as well as 
allowing for overlays and underlays. For every pixel which enters the XMAP2, the WID is checked 
against a table in the XMAP2 loaded by software over the Display Bus. The table connects a particular 
WID to a certain display mode for that window. A WID may indicate that the window is in color index or 
RGB mode, that it is single or double buffered, and that it holds a 4-bit color index image or a 12-bit RGB
image. Using the WID, the XMAP2 correctly routes the data it receives from the bitplanes on a pixel by pixel
basis. For an RGB pixel, the bitplane data is padded and shifted as necessary, and put out as three 8 bit
quantities as inputs to the three DACs, one each for red, green and blue. For a color index pixel, the bitplane
data is padded and shifted if necessary, and put out as the address into the color lookup tables. The XMAP2 also
monitors data from the auxiliary bitplanes. If some of the overlay or underlay functions are activated, the
XMAP2 will override the data from the image bitplanes with 24 bits of data from an internal auxiliary
lookup map (indexed by the value from the auxiliary planes). See the XMAP2 spec for further details.
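
The WID-driven routing can be sketched as a table lookup followed by a format conversion. Everything
specific below is an assumption made for illustration: the table layout, the bit positions of the red,
green and blue nibbles, and the choice to pad 4 bits to 8 by replicating the nibble. The XMAP2 spec
governs the real behavior.

```python
def expand_nibble(nibble):
    # Assumption: pad 4 bits to 8 by replicating the nibble (0xF -> 0xFF).
    return (nibble << 4) | nibble


def expand_rgb12(pixel):
    # Assumption: red in bits 11..8, green in bits 7..4, blue in bits 3..0.
    red = (pixel >> 8) & 0xF
    green = (pixel >> 4) & 0xF
    blue = pixel & 0xF
    return expand_nibble(red), expand_nibble(green), expand_nibble(blue)


def route_pixel(wid, bitplanes, mode_table):
    mode = mode_table[wid]
    if mode["rgb"]:
        # RGB pixels go to the DACs as three 8 bit quantities.
        return "dac", expand_rgb12(bitplanes & 0xFFF)
    # Color index pixels become an address into the color lookup tables.
    index_mask = (1 << mode["index_bits"]) - 1
    return "colormap", bitplanes & index_mask


# Hypothetical table contents, as software might load them over the Display Bus.
MODE_TABLE = {
    0: {"rgb": False, "index_bits": 12},
    1: {"rgb": True},
    2: {"rgb": False, "index_bits": 4},
}
```

Two adjacent pixels with different WIDs can therefore be interpreted in entirely different formats.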

Color index pixels are passed through the color lookup maps which output the 24 bit RGB quantities. The
color map outputs are tied directly to the RGB outputs of the XMAP2. One critical timing is in swapping
the output drive between the color maps and the XMAP2. They are currently allowed to contend. The
second critical timing is from driving an address into the color maps to getting their output data set up at
the RAMDAC inputs. 4K entries in the color maps are provided to allow for 12 bit color index images.
All 4K are free for use by user programs, as the RAMDACs hold enough ram (256 words) to perform
gamma correction on all pixel values they receive.

All XMAP2 internal registers and color maps are loaded across the Display Bus. Each XMAP2 includes 5
general purpose register bits, three writeable by the host, and two more readable by the host. These bits are
used on the GR1 for controlling and monitoring miscellaneous functions on the board.

9. Hardware Cursor and RAMDACs 

Locations for two Bt431 hardware cursor chips are provided on the GR1. The output from each chip, when
high, will cause the RAMDAC chips to force a value from internal overlay registers onto their
outputs instead of the values received from the XMAPs. If only one chip is used, only one cursor color is
available. If two chips are used, three cursor colors are available because either one of the two outputs
may be high, or both at once. Inside the Bt431 is storage for a 64 by 64 cursor glyph. Registers maintain
the current position of the cursor on the screen, and counters maintain the current position of the monitor
beam in the output pixel stream. When the two match, the Bt431 output will force a cursor overlay.
Software updates the current cursor position over the Display Bus.
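
The position match and the three-color arithmetic can be sketched as follows. This models only the
behavior described in the paragraph above, not the Bt431 register set; the function names are
hypothetical, and the cursor position is assumed to refer to the top left corner of the glyph.

```python
GLYPH_SIZE = 64


def glyph_bit(glyph, cursor_x, cursor_y, beam_x, beam_y):
    # Offset of the beam inside the glyph, if it is inside at all.
    dx = beam_x - cursor_x
    dy = beam_y - cursor_y
    if 0 <= dx < GLYPH_SIZE and 0 <= dy < GLYPH_SIZE:
        return glyph[dy][dx]
    return 0


def overlay_select(chip_a, chip_b):
    # Two one-bit chip outputs form a two-bit overlay code for the RAMDACs.
    return (chip_b << 1) | chip_a


# Every non-zero code is a distinct cursor color; zero means no overlay.
cursor_colors = {overlay_select(a, b) for a in (0, 1) for b in (0, 1)} - {0}
print(len(cursor_colors))
```

With a single chip fitted only one non-zero code can occur, which is why only one cursor color is
available in that configuration.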

Three Bt457 RAMDAC chips are included on the GR1, one each for the red, green and blue components of
the 1280 by 1024 pixel monitor. Each RAMDAC outputs RS343 compatible voltage levels. Inputs to the
RAMDAC are 5 pixel pipes of 10 bits apiece. Eight bits are the bits of color component which typically
are translated to the analog output. The other two bits, when high, force an overlay over the eight bit color
component value for cursor applications. The forced value comes from a tiny lookup table indexed by the
two input overlay bits. The five pixel pipes are multiplexed down to a single data stream running at the
monitor rate (107MHz for high resolution monitor). This data stream is then passed through a lookup table
to produce a different 8 bit output quantity which is subsequently translated to an analog output signal. The
interim lookup table is used for gamma correcting the values received from the XMAP2s to correct for
color nonlinearities in the monitor being used. The gamma correction map and cursor overlay map are
updated over the Display Bus.
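
The per-component data path can be sketched in a few lines. This is an illustration under stated
assumptions: the function names are hypothetical, the gamma value of 2.2 is only an example, and the
forced overlay value is assumed to bypass the gamma lookup table.

```python
PIXEL_RATE_MHZ = 107   # monitor rate for the high resolution monitor
PIXEL_PIPES = 5

# Each of the five input pipes only needs to run at one fifth of the pixel rate.
pipe_rate_mhz = PIXEL_RATE_MHZ / PIXEL_PIPES


def build_gamma_table(gamma=2.2):
    # 256 entries, one per possible 8 bit component value.
    return [round(255 * (i / 255) ** (1 / gamma)) for i in range(256)]


def ramdac_output(component8, overlay2, gamma_table, overlay_table):
    # Non-zero overlay bits force a value from the small overlay table.
    if overlay2:
        return overlay_table[overlay2]
    # Otherwise the component passes through the gamma correction lookup.
    return gamma_table[component8]


gamma_table = build_gamma_table()
print(pipe_rate_mhz, ramdac_output(128, 0, gamma_table, [0, 10, 20, 30]))
```

Because the lookup holds one entry for every possible 8 bit input, gamma correction costs no entries in
the 4K color maps.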

10. Display State Machine 




May 23, 1988


SGI Confidential 



FIGURE 1: BLOCK DIAGRAM

(Diagram not reproduced. Legible labels: HOST INTERFACE; SIGNALS FOR OPTIONAL Z-BUFFER &
EXPANSION FRAME BUFFER; GENLOCK SIGNALS.)

FIGURE 2

(Timing diagram not reproduced.)

SIGNALS ON GR1 AFTER BUFFERING:

    HOST.10MHZ
    HOST.PRECLK
    HOST.OE-EN\

THE BUFFERS ON THE GR1 MEET THE FOLLOWING DELAY
RESTRICTION ON THE LOW TO HIGH TRANSITIONS:

    6 NS <= BUFFER PROPAGATION DELAY <= 16 NS

THE GR1 PROVIDES ADDRESS AND DATA ON THE CABLE BETWEEN THE BOARDS
AS SHOWN BELOW WITH RESPECT TO THE BUFFERED CLOCK ON THE GR1:

    HOST.10MHZ
    HOST.AD*     DON'T CARE / VALID

    T1 < AS NS
    T2 > 4 NS

THIS GUARANTEES 28 NS SETUP AND 8 NS HOLD WITH RESPECT TO
THE BUFFERED HOST.PRECLK AS SEEN ON THE HOST BOARD
(THIS SETUP BEING SEEN ON THE CABLE).

FIGURE 3: HANDSHAKE SIGNAL TIMINGS

(Timing diagram not reproduced.)

GE5 (caption partly illegible) DIAGRAM

(Diagram not reproduced.)

FIGURE 5: MICROCODE WORD DEFINITION

(Diagram not reproduced. Each 40 bit field is marked at bits 39, 31 and 23. Legible field labels follow.)

FIELD ONE: DB-SRC1, REPTR1, DB-DEST, PC-OP, FCN, AADDR, BADDR

FIELD TWO: REPTR2, MEMPTR2, BPADDR